A parallel finite element surface fitting algorithm for data mining
Abstract
A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. Multivariate regression techniques form the core of many data mining applications. A common assumption is that the multivariate data is well approximated by an additive model involving only first and second order interaction terms. In this case high-dimensional nonparametric regression is reduced to the determination of a coupled set of first and second order interaction terms, that is, the determination of a coupled set of curves and surfaces. Thin plate splines provide a very good method to determine an approximating surface. Obtaining standard thin plate splines requires the solution of a dense linear system of equations of order n, where n is the number of observations. For data mining applications the number of observations is often in the millions, so standard thin plate splines may not be practical. We have developed a finite element approximation of a spline that can handle data sizes with millions of records. The resolution of the finite element method can be chosen independently of the number of observations. The observation data can be read from secondary storage once and does not need to be stored in memory. In this paper, we discuss the parallel implementation of this method in an MPI environment.
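The single-pass, resolution-independent assembly described in the abstract can be illustrated with a much simplified sketch. It is not the authors' method: it assumes a one-dimensional analogue with piecewise-linear (hat) basis functions on a uniform grid over [0, 1] and a discrete second-difference roughness penalty in place of the thin plate spline functional. The names `assemble_chunk`, `roughness_penalty`, `fit`, `m` and `alpha` are hypothetical. Each chunk of observations is touched once and contributes only to an m-by-m system, so the system size depends on the grid, not on n. In an MPI setting the per-chunk contributions would presumably be summed with a reduction across processes; here that step is only simulated by adding the partial results in a loop.

```python
import numpy as np


def assemble_chunk(x, y, m):
    """Accumulate normal-equation contributions of one data chunk.

    x, y : observations with x in [0, 1]
    m    : number of grid nodes (the finite element resolution)
    """
    h = 1.0 / (m - 1)
    M = np.zeros((m, m))
    b = np.zeros(m)
    # Element index of each point and its local coordinate t in [0, 1].
    i = np.minimum((x / h).astype(int), m - 2)
    t = x / h - i
    w0, w1 = 1.0 - t, t  # values of the two hat functions at x
    # np.add.at accumulates correctly when indices repeat.
    np.add.at(M, (i, i), w0 * w0)
    np.add.at(M, (i, i + 1), w0 * w1)
    np.add.at(M, (i + 1, i), w1 * w0)
    np.add.at(M, (i + 1, i + 1), w1 * w1)
    np.add.at(b, i, w0 * y)
    np.add.at(b, i + 1, w1 * y)
    return M, b


def roughness_penalty(m):
    """Discrete second-difference penalty (assumed stand-in for the spline energy)."""
    h = 1.0 / (m - 1)
    D = np.diff(np.eye(m), n=2, axis=0) / h**2  # shape (m - 2, m)
    return h * D.T @ D


def fit(chunks, m, alpha):
    """Read each chunk once, sum the partial systems, then solve an m-by-m system."""
    M = np.zeros((m, m))
    b = np.zeros(m)
    n = 0
    for x, y in chunks:  # each chunk could be handled by a different process
        Mc, bc = assemble_chunk(x, y, m)
        M += Mc  # stands in for a parallel sum of partial matrices
        b += bc
        n += len(x)
    return np.linalg.solve(M / n + alpha * roughness_penalty(m), b / n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(100_000)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)
    # Split the data into four chunks, as if distributed over four processes.
    chunks = zip(np.array_split(x, 4), np.array_split(y, 4))
    coef = fit(chunks, m=33, alpha=1e-6)
    print(coef[:5])
```

The data chunks are discarded after assembly, so memory use is governed by `m` alone. The smoothing parameter `alpha` and the grid size are illustrative choices, not values taken from the paper.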
Related articles
Scalable parallel algorithms for surface fitting and data mining
This paper presents scalable parallel algorithms for high dimensional surface fitting and predictive modelling which are used in data mining applications. These algorithms are based on techniques like finite elements, thin plate splines, wavelets and additive models. They all consist of two steps: First, data is read from secondary storage and a linear system is assembled. Secondly, the linear ...
Parallelization of a finite element surface fitting algorithm for data mining
A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. An important data mining technique is multivariate regression, and an essential subtask is the estimation of interaction surfaces, i.e. the estimation of functions of two variables. Thin plate splines provide a very good method to determine an approximating surface. Obtainin...
Speeding up the Stress Analysis of Hollow Circular FGM Cylinders by Parallel Finite Element Method
In this article, a parallel computer program based on the Finite Element Method is implemented to speed up the analysis of hollow circular cylinders made from Functionally Graded Materials (FGMs). FGMs are inhomogeneous materials whose composition varies gradually over the volume. In parallel processing, an algorithm is first divided into independent tasks, which may use individual or shared da...
Extended finite element simulation of crack propagation in cracked Brazilian disc
The cracked Brazilian disc (CBD) specimen is widely used in order to determine mode-I/II and mixed-mode fracture toughness of a rock medium. In this study, the stress intensity factor (SIF) on the crack-tip in this specimen is calculated for various geometrical crack conditions using the extended-finite element method (X-FEM). This method is based upon the finite element method (FEM). In this m...
Comprehensive Parametric Study for Design Improvement of a Low-Speed AFPMSG for Small Scale Wind-Turbines
In this paper, a comprehensive parametric analysis of an axial-flux permanent magnet synchronous generator (AFPMSG), designed to operate in small-scale wind-power applications, is presented, and the conditions for maximum efficiency, minimum weight and minimum cost are deduced. Then a Computer-Aided Design (CAD) procedure based on the results of the parametric study is proposed. Matching between t...